


Glycans on mucosal surfaces have an important role in host-microbe interactions. The locus
encoding the blood-group-related glycosyltransferase β-1,4-N-acetylgalactosaminyltransferase 2
(B4galnt2) is subject to strong selective forces in natural house-mouse populations that contain a
common allelic variant that confers loss of B4galnt2 gene expression in the gastrointestinal (GI)
tract. We reasoned that altered glycan-dependent intestinal host-microbe interactions may underlie
these signatures of selection. To determine whether B4galnt2 influences the intestinal microbial
ecology, we profiled the microbiota of wild-type and B4galnt2-deficient siblings throughout the GI
tract using 16S rRNA gene pyrosequencing. This revealed both distinct communities at different
anatomic sites and significant changes in composition with respect to genotype, indicating a
previously unappreciated role of B4galnt2 in host-microbial homeostasis. Among the numerous
B4galnt2-dependent differences identified in the abundance of specific bacterial taxa, we
unexpectedly detected a difference in the pathogenic genus Helicobacter, suggesting that Helicobacter
spp. also interact with B4galnt2 glycans. In contrast to other glycosyltransferases, we found that
host intestinal B4galnt2 expression is not dependent on the presence of the microbiota. Given the
long-term maintenance of alleles influencing B4galnt2 expression by natural selection and the GI
phenotypes presented here, we suggest that variation in B4galnt2 GI expression may alter
susceptibility to GI diseases such as infectious gastroenteritis.
Introduction

Host glycans have been shown to participate
directly in specific host-microbe interactions
(Bishop and Gagneux, 2007), and signatures of
selection observed at the loci of carbohydrate blood
group-related genes in humans (Calafell et al., 2008;
Ferrer-Admetlla et al., 2009) are most likely the
result of host-pathogen interactions (Anstee, 2010).
The Sd(a)/Cad carbohydrate determinant is a
polymorphic blood group antigen of unknown
function expressed on the red cells of 90% of
human populations (Conte and Serafini-Cessi,
1991) and detectable in several other human tissues,
including the intestinal mucosa and kidney in 98%
of humans (Morton et al., 1988). The glycosyltransferase
β-1,4-N-acetylgalactosaminyltransferase 2
(B4galnt2) is responsible for catalyzing the last step
in the biosynthesis of the Sd(a)/Cad antigen by the
addition of an N-acetylgalactosamine (GalNAc)
residue via a β-1,4 linkage to a subterminal galactose
residue substituted with an α-2,3-linked sialic acid
(Lo et al., 2003; Montiel et al., 2003). In the
intestine, there is evidence of a gradient in GalNAc
residues conferred by B4galnt2 from the ileum to
the colon (Robbe et al., 2003). The character of
intestinal mucus itself also changes with location,
both in variation in the types of mucins present
(Corfield et al., 2001) and in the balance of
glycolipids and glycoproteins that may be capable
of acting as B4galnt2 substrates (Dohi et al., 1996;
Kawamura et al., 2005). Thus, distinct carbohydrate
mucosal phenotypes are likely to result from
B4galnt2 expression at different intestinal anatomic
sites.

Intestinal expression of B4galnt2 is observed
throughout vertebrates from fish (Stuckenholz
et al., 2009) to humans, including mice. Although
conservation across species is evidence for a functional
constraint preventing the loss of intestinal
expression, mice genetically deficient in B4galnt2
(B4galnt2-/-) display no obvious phenotype,
gastrointestinal (GI) or otherwise, under specified
pathogen-free (SPF) conditions in the laboratory (Mohlke
et al., 1999; Johnsen et al., 2008).

The absence of intestinal B4galnt2 expression is
also observed in mice homozygous for a spontaneous
cis-regulatory mutation termed Modifier of
von Willebrand Factor-1 (Mvwf1). Mvwf1 specifically
turns off B4galnt2 expression in the intestinal
epithelium and instead directs B4galnt2 expression
in vascular endothelium. Endothelial cell expression
of B4galnt2 results in aberrant posttranslational
modification of the blood-clotting glycoprotein, von
Willebrand factor (VWF), resulting in accelerated
VWF clearance and low circulating VWF levels
(Mohlke et al., 1999), similar to the common human
bleeding disorder, von Willebrand disease (Sweeney
et al., 1990). As in the B4galnt2 knockout mouse,
loss of intestinal B4galnt2 expression due to Mvwf1
does not result in an obvious GI phenotype in the
laboratory. Mvwf1 is common among laboratory
mouse strains (Johnsen et al., 2008), and is present
at intermediate frequencies in wild Mus musculus
domesticus (house mouse) populations, where there
is strong evidence of both recent and long-term
selection at the B4galnt2 locus (Johnsen et al., 2009).

These observations seem paradoxical: Mvwf1
confers what would be expected to be a detrimental
phenotype in M. musculus (a mild bleeding diathesis),
yet the Mvwf1 allele is common in wild
mice. The prevalence of natural murine B4galnt2
variants is further illustrated in our recent survey of
multiple wild-mouse populations, in which we
found loss of B4galnt2 intestinal glycans conferred
by the Mvwf1 allele class to be frequent, but not
always accompanied by the gain of B4galnt2
vascular expression (Linnenbrink et al., 2011).

We hypothesized that loss of intestinal B4galnt2 
expression is at least in part responsible for the 
striking signatures of selection at the B4galnt2 locus 
in wild house mice. In this study, we sought to 
determine the influence of B4galnt2 expression on 
the resident microbiota throughout the GI tract as 
evidence of an intestinal B4galnt2 phenotype, as 
shifts in the intestinal microbiota are known to 
affect host susceptibility to pathogens and disease 
(Bishop and Gagneux, 2007; Sekirov et al., 2010;
Stecher and Hardt, 2010). We performed high-throughput
16S rRNA gene profiling at multiple
intestinal sites in mouse sibling pairs that differed 
only in the presence or absence of B4galnt2 expres- 
sion. In addition, we analyzed B4galnt2 expression 
patterns in germ-free versus conventional mice. 
Here we describe this detailed characterization of 
mouse GI bacterial communities in seven distinct 
locations from the duodenum to the colon and 
provide evidence for a significant effect of B4galnt2 
expression on intestinal bacterial populations. 



Materials and methods 

Animal material and tissue sampling 
All animal protocols (except for the germ-free 
protocols, see below) were approved by the 
University of Michigan University Committee on 
the Use and Care of Animals (UCUCA). C57BL6/J 
animals were purchased from the Jackson Labora- 
tory (Bar Harbor, ME, USA). B4galnt2 knockout 
animals first engineered by Dr John Lowe were 
provided with permission and courtesy of Dr David 
Ginsburg. Genetic background, maternal effect, 
housing conditions, gender and diet were accounted 
for in the following mating scheme: B4galnt2-/-
animals bred for more than 20 generations to a
C57BL6/J background were mated to C57BL6/J
animals to generate heterozygous B4galnt2+/-
parents. B4galnt2+/- parents were then intercrossed
to generate B4galnt2+/+ and B4galnt2-/-
male sibling offspring. Four 'sibpair' cages containing
at least one B4galnt2+/+ and one B4galnt2-/- sibling
were raised and housed together with standard 
mouse chow in the same room in a specified 
pathogen-free (SPF) animal facility at the same time. 

Studies were performed at 10 weeks of age to 
ensure adequate time for the colonization and 
development of mature, stable microbiota (Rehman 
et al., 2011). Tissues were harvested by fresh
dissection of 1 cm of bowel at each of the following 
defined anatomic sites: mid-duodenum (D2), mid- 
jejunum, terminal ileum (ending at the cecal valve), 
cecum and mid-descending colon. With the excep- 
tion of the cecum, luminal contents were separated 
from mucosal specimens by flaying open the bowel, 
washing twice with 0.7 ml ice-cold RNALater 
(Ambion, Carlsbad, CA, USA) and the 1.4-ml 
effluent containing stool pooled and marked as 
'luminal contents'. The bowel was then placed in 
1.4 ml ice-cold RNALater and marked as 'mucosa'. 
For the cecal samples, the cecal pouch was 
dissected and flayed open, the entire specimen 
placed in a 15-ml tube with 4 ml ice-cold RNALater, 
and the specimen gently shaken to remove the cecal 
contents. The mucosa was retrieved with forceps 
and rinsed with an additional 0.7 ml of ice-cold 
RNALater. This rinse was added to the 4-ml stool 
solution and marked 'luminal contents'. The rinsed 
cecal tissue was then placed in 1.4 ml ice-cold 
RNALater and marked 'mucosa'. Owing to variability
in the presence of luminal contents in the
sections of the small intestine, only the contents of 
the cecum and descending colon were analyzed. 
Instruments were cleaned between each anatomic 
site to avoid cross-contamination. 

Germ-free animals

C57BL/6J animals were raised under conven- 
tional or germ-free conditions at the University of 
Gothenburg. All animal protocols were approved
by the Research Animal Ethics Committee in 
Gothenburg. At 6 weeks of age, small bowel was 
harvested for Dolichos biflorus agglutinin (DBA) 
lectin histochemistry as described below. 

DBA lectin histochemistry 

DBA lectin is reactive with terminal N-acetyl-D- 
galactosamine (GalNAc) residues and specifically 
detects the Mvwf1 switch in B4galnt2 expression
from the intestine to the blood vessel (Mohlke et al.,
1999). Fresh small bowel harvested from C57BL6/J
and B4galnt2-/- animals was fixed in Z-fix (Anatech
Ltd, Battle Creek, MI, USA) at room temperature 
overnight, then paraffin embedded. DBA lectin 
histochemical staining was performed on embedded 
tissues using horseradish peroxidase-conjugated 
DBA (EY Laboratories, Inc., San Mateo, CA, USA) 
as previously described (Mohlke et al., 1999). 

DNA extraction 

Approximately 100 mg of tissue or luminal contents
was transferred to a 2-ml screwcap tube containing
glass beads 0.1, 0.5 and 1 mm in size, each 50 mg
(BioSpec Products, Bartlesville, OK, USA), and
1.4 ml lysis buffer ASL from
the QIAamp DNA stool mini kit (Qiagen, Hilden,
Germany). After bead beating in a Precellys (Peqlab,
Erlangen, Germany) bead beater (3 × 15 s at
6500 r.p.m.), samples were heated to 95 °C for
10 min and constantly shaken in a thermomixer
(Eppendorf). Bacterial DNA was then extracted
using the QIAamp DNA stool mini kit (Qiagen)
following the manufacturer's instructions.

PCR and pyrosequencing 

Universal bacterial primers for the V3 16S rRNA
variable region were used to amplify bacterial 16S
rRNA genes as described by Dethlefsen et al. (2008). PCR
amplicon primers were fused to molecular identifier
(MID) tags and 454 sequencing adapters according to
the manufacturer's instructions (Roche, Basel, Switzerland).
All PCR reactions were performed in 50-µl
duplicates and combined after PCR. The products
were extracted with the Qiagen MinElute Gel 
Extraction Kit and quantified with the Quant-iT 
(Invitrogen, Karlsruhe, Germany) dsDNA BR Assay 
Kit on a Nanodrop 3300 fluorometer. Equimolar 
amounts of purified PCR product from seven knockout
and six wild-type individuals were pooled to
generate libraries for each anatomic location. Pooled 
amplicon libraries were further purified using 
Ampure Beads (Agencourt, Bernried, Germany). A 
sample of each library was run on an Agilent 
(Waldbronn, Germany) bioanalyzer as described in 
the Roche Titanium Amplicon Sequencing protocol 
before entering emulsion PCR and sequencing. Each 
library was sequenced on one-eighth of a picotiter 
plate on a 454 GS-FLX (Roche). 



Sequence analysis 

Raw reads obtained from the sequencer were filtered 
using a perl script according to the following 
criteria: average quality ≥25, no ambiguous bases,
length between 110 and 200 bp, excluding MID tags 
and bacterial primers, and a perfect match to the 
MID and bacterial primer. Primer and tag sequences 
were removed using alignment with the Needle- 
man-Wunsch algorithm (Needleman and Wunsch, 
1970). Sequences were sorted by individuals to 
produce group files for analysis in MOTHUR 
v.1.12.3 (Schloss, 2009). Classification into bacterial 
phyla was performed with the Ribosomal Database 
Project classifier tool (Wang et al., 2007) and the 
RDP taxonomy database as implemented in 
MOTHUR. Reads were aligned with the kmer 
algorithm available under the align.seqs command
(Schloss, 2009) in MOTHUR to the SILVA reference
database (Pruesse et al., 2007). Sequences that did 
not match the reference alignment in the expected 
positions were removed. 
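
The read-level criteria listed above can be illustrated with a short sketch. This is not the authors' perl script, which is not reproduced here; it is a minimal Python illustration that assumes each read has already had its MID tag and primer removed and that its quality scores are available as a list of integers. The function name `passes_filter` and its parameters are hypothetical, and the requirement of a perfect match to the MID and primer is not covered.

```python
from statistics import mean

def passes_filter(seq, quals, min_avg_q=25, min_len=110, max_len=200):
    """Return True if a read meets the stated quality criteria.

    Assumes `seq` excludes the MID tag and primer and `quals` holds one
    integer quality score per base (hypothetical input layout).
    """
    if not seq or len(seq) != len(quals):
        return False
    if not (min_len <= len(seq) <= max_len):
        return False
    # Any base other than A, C, G or T counts as ambiguous.
    if any(base not in "ACGT" for base in seq.upper()):
        return False
    return mean(quals) >= min_avg_q
```

Reads failing any one criterion are discarded, so the order of the checks affects only speed, not the outcome.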



Statistical analysis 

Aligned sequences were used to build a distance 
matrix and group sequences into operational taxo- 
nomic units (OTUs) in MOTHUR. Species richness 
estimates and collector's curves were generated 
based on these OTUs and drawn using the R 
statistics package v. 2.11.1 (R Development Core
Team, 2010). Phylogenetic trees were built based 
on the MOTHUR-derived distance matrices with 
FastTree v. 2.1.3 (Price et al., 2009, 2010) and 
submitted to the Fast Unifrac online tool for 
principal coordinate analysis (PCoA) of Unifrac 
distances (Hamady et al., 2009). Linear models were 
fitted to the data using the 'lm' function implemented
in R. Two mixed-effects models were made with
the 'lme' function contained in the nlme package v.
3.1-96 (Pinheiro et al., 2009) to account for potential 
maternal (cage of origin) effects (models 6 and 7 in 
Table 1). To compare species richness between 
different sections of the gut, we subsampled the 
same number of reads per individual and section 
(in silico capping), calculated Chao's richness
estimator (Chao, 1984) and compared the number 
of OTUs observed in the subsamples. 
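
For readers unfamiliar with Chao's richness estimator, the sketch below shows the classic form, in which observed richness is increased by a term based on the numbers of singleton and doubleton OTUs. It is an illustration only: the analysis here was run in MOTHUR, and which variant of the estimator that version applies is not stated in the text. The fallback used when no doubletons are present is an assumption of this sketch.

```python
from collections import Counter

def chao1(otu_counts):
    """Chao's (1984) richness estimate from per-OTU read counts."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)              # number of OTUs actually observed
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]        # singletons and doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here to avoid division by zero.
    return s_obs + f1 * (f1 - 1) / 2
```

Because the estimate depends on rare OTUs, it is sensitive to sequencing depth, which is why read numbers are equalized before richness is compared between gut sections.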
Table 1 Linear model comparisons for the effect of genotype on Unifrac PCos

Model  Response      Explanatory variables                              AIC       P-value   Residual SS  R²
1      Unifrac PCo1  Null model                                         16.66               5.66
2      Unifrac PCo1  Gut section                                        -145.86   1.20E-32  0.69         0.88
3      Unifrac PCo2  Null model                                         -115.81             1.15
4      Unifrac PCo2  Gut section                                        -213.42   6.87E-20  0.31         0.73
5      Unifrac PCo2  Gut section+genotype                               -217.864  0.016 a   0.28         0.07 a
6 b    Unifrac PCo2  Fixed = genotype; Random = section/cage            NA        0.017     NA           NA
7 b    Unifrac PCo2  Fixed = genotype; Random = section/cage/genotype   NA        0.019     NA           NA

Abbreviations: Cage, cage of origin (maternal effect); NA, not applicable in that context; section, location along the intestine.
a ANOVA in comparison with model 4.
b Linear mixed effects model.
P-values by ANOVA.
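
The last two numeric columns of Table 1 are arithmetically linked for the gut-section models: one minus the ratio of a model's residual value to that of its null model reproduces the final column. The sketch below checks this for models 1 to 4. Reading the third numeric column as a residual sum of squares and the last as R² is an inference from the numbers themselves, not something stated in the table, and the function name is hypothetical.

```python
def r_squared(rss_model, rss_null):
    """Fraction of variance explained relative to an intercept-only model."""
    return 1.0 - rss_model / rss_null

# Values copied from Table 1 (models 1-4).
pco1 = r_squared(0.69, 5.66)   # gut section vs null model, PCo1
pco2 = r_squared(0.31, 1.15)   # gut section vs null model, PCo2
print(round(pco1, 2), round(pco2, 2))   # prints: 0.88 0.73
```

These values match the 88% and 73% of variance attributed to gut section for the first two principal coordinates.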



Candidate bacterial species distinguishing the 
genotypes were determined by calculating the point 
biserial correlation between species-level (97%
sequence identity) OTUs and genotype with the
'multipatt' function as part of the 'indicspecies'
R package v. 1.5.1 (De Caceres and Legendre, 2009). 
This approach considers not only the difference in 
read number assigned to an OTU between geno- 
types, but also the number of individuals of each 
genotype in which the OTU is observed, thus 
drawing statistical power from biological replicates. 
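
The point biserial correlation is the ordinary Pearson correlation between a quantitative variable (here, OTU abundance per mouse) and a 0/1 group label (here, genotype). The sketch below computes it directly. It is a simplified illustration, not the indicspecies implementation: that package also offers a group-equalized variant and derives P-values by permutation, neither of which is shown, and the text does not say which variant was used.

```python
from math import sqrt
from statistics import mean, pstdev

def point_biserial(abundances, in_group):
    """Pearson correlation between OTU abundance and a 0/1 group label."""
    g1 = [a for a, g in zip(abundances, in_group) if g]
    g0 = [a for a, g in zip(abundances, in_group) if not g]
    n1, n0, n = len(g1), len(g0), len(abundances)
    sd = pstdev(abundances)          # population standard deviation
    if n1 == 0 or n0 == 0 or sd == 0:
        return 0.0                   # correlation undefined; report none
    return (mean(g1) - mean(g0)) / sd * sqrt(n1 * n0 / (n * n))
```

An OTU present at a uniform level in every mouse of one genotype and absent from the other gives a value of 1, whereas an OTU abundant in only one mouse of a genotype scores lower.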

The number of sequence reads per sample was 
capped in silico by random subsampling to obtain 
equal sample sizes for species richness, Unifrac and 
for detection of candidate species. To provide an 
initial overview of the sequence reads associated 
with each region of the GI tract, we classified 
sequences by aligning them to the SILVA reference 
database (Pruesse et al., 2007) and obtained taxonomic
information from RDP Classifier (Wang et al.,
2007). 
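
In silico capping amounts to drawing the same number of reads at random, without replacement, from every sample. A minimal sketch follows, assuming each sample is summarized as a mapping from OTU name to read count; the function name, the input layout and the fixed seed are hypothetical choices made for illustration.

```python
import random

def subsample_counts(otu_counts, depth, seed=0):
    """Randomly draw `depth` reads without replacement from one sample."""
    # Expand the count table into one entry per read.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)        # seeded for reproducibility
    capped = {}
    for otu in rng.sample(reads, depth):
        capped[otu] = capped.get(otu, 0) + 1
    return capped
```

Samples with fewer reads than the chosen depth cannot be capped and would have to be excluded or the depth lowered.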

Results 

DBA lectin staining 

DBA lectin staining of the small bowel from 
C57BL6/J animals demonstrates strong DBA reacti- 
vity in intestinal epithelial cells, the mucus material 
contained in goblet cells, and the lining of the 
intestinal luminal surface. No DBA reactivity is seen 
in the small bowel of B4galnt2-/- animals (Figure 1).
These data show that DBA lectin histochemistry is a
specific method for detection of B4galnt2-carbohydrates
in the bowel, which are evident on the
intestinal epithelium and mucus of B4galnt2-sufficient
animals but absent in B4galnt2-/- animals.

Figure 1 Small-bowel DBA lectin staining. (a) Wild-type
(C57BL6/J) intestine exhibits robust intestinal epithelial DBA
lectin staining (brown). Mucus (arrows indicate mucus-containing
goblet cells) is strongly DBA lectin positive. (b) B4galnt2-/-
bowel demonstrates complete loss of DBA lectin staining of the
epithelium and mucus.

High-throughput sequencing of the B4galnt2+/+ and
B4galnt2-/- intestinal microbiota
To determine the influence of B4galnt2 expression
on the resident microbiota throughout the GI tract,
we performed high-throughput pyrosequencing
(454) of the bacterial 16S rRNA gene. Our sibpair
breeding scheme (see Materials and methods)
yielded a total of six B4galnt2+/+ and seven
B4galnt2-/- individuals. We surveyed the bacterial
communities associated with the mucus layer of 
five major functional compartments of the murine 
intestine (duodenum, jejunum, ileum, cecum and 
descending colon) and the luminal contents of the 
cecum and colon. The 16S rRNA V3 region was 
successfully amplified from all individuals at all 
sites, with the exception of the duodenum of one 
B4galnt2-/- individual (see Supplementary Table 1
for distribution of reads per GI tract location). A total
of 337 449 reads of the hypervariable region V3 were
analyzed after quality filtering. A single B4galnt2-/-
individual displayed a strongly deviant pattern with
respect to the relative abundance of the major
bacterial phyla compared with all 12 other mice in
the small intestine (Supplementary Figure 1),
despite no notable phenotype at the time of dissection.
This individual was housed with two other
siblings, one B4galnt2+/+ and one B4galnt2-/-,
which did not exhibit this pattern. Although we 
cannot exclude that this aberration is a consequence 
of loss of B4galnt2 expression, we speculate that this 
animal was subclinically ill or otherwise compro- 
mised, resulting in a shift distinct from other 
genotypes. Thus, we removed this individual from 
further analyses. 
Bacterial phyla of the GI tract by anatomic site
The communities within the distinct sections of the 
GI tract differ largely in their composition and 
proportion of the major bacterial phyla (Figure 2). 
The mucosa of the small intestine (duodenum, 
jejunum and ileum) is dominated by Firmicutes 
(77%, 78% and 91%, respectively). As previously 
reported by Ley et al. (2005), we also detected 
sequences of cyanobacterial origin in the small
intestine (1.5% in duodenum and 2.1% in the 
jejunum). The mucosa of the cecum and colon 
contained a comparatively lower proportion of 
Firmicutes (40% and 25%, respectively) and higher 
proportion of Bacteroidetes (19% and 15%, respec- 
tively) and Proteobacteria (37% and 59%, respec- 
tively). The cecum samples also contained 
Deferribacteres (2.5%). 

The differences between the mucosa and the 
adjacent luminal contents of the cecum and colon 
are particularly striking. Cecum and colon mucosal 
tissues contain 19% and 15% Bacteroidetes, com- 
pared with 32% and 64%, respectively, in the 
lumen. The opposite pattern is found for Proteo- 
bacteria, with 37% and 59%, respectively, in the 
cecum and colon mucosal tissues, compared with 
8% and 3% in the lumen. Furthermore, the 
Tenericutes are more abundant in the lumen. This 
remarkable distinction between the mucosal and 
luminal communities might be expected in light of 
previous findings that bacterial populations occupy- 
ing human colonic mucosa are distinct from those 
present in fecal samples (Eckburg et al., 2005; 
Willing et al., 2010). Our results provide direct 
support for the presence of distinct communities 
inhabiting the niches found along the GI tract, 
including those associated with the tissue versus 
lumen at a given location. 



Bacterial species richness differs between sections of 
the GI tract 

To characterize the diversity of these communities at 
lower taxonomic levels, we tested for differences in 
bacterial species richness in the B4galnt2+/+ and
B4galnt2-/- mice at the five mucosal anatomic
locations of the GI tract and in the luminal contents
of the cecum and colon (Supplementary Figures 2 
and 3). For this analysis, sequences were grouped 
into species-level (97% similarity) OTUs and Chao's 
species richness estimator was calculated. 

Although we detected no significant difference in 
species richness with respect to B4galnt2 genotype 
in any of the sections, several interesting patterns 
are apparent with regard to the individual GI tract 
locations. Species richness is highest in the luminal 
samples and on average lower in the mucus layer 
(335 versus 187 species, Chao's species richness 
estimator), consistent with the protective function of 
the mucus layer and a more controlled interaction 
between host and microbes closer to the intestinal 
epithelium. The high richness observed in the 
cecum may relate to its role as a biofermenter 
offering stable conditions for the growth of diverse 
species (Savage, 1977). Interestingly, we also 
observe a progressive decline of bacterial species 
richness along the small intestine (201, 168 and 82 
for duodenum, jejunum and ileum, respectively, 
Chao's species richness estimator), mainly due to a 
much lower diversity in the ileum. 

Species composition differs between B4galnt2+/+ and
B4galnt2-/- mice

To test whether the composition of bacterial communities
is influenced by B4galnt2 expression,
we compared bacterial communities between
B4galnt2+/+ and B4galnt2-/- mice using the
phylogeny-based beta-diversity measure
UniFrac. This metric represents the distance
between bacterial communities by comparing the 
shared branch length between phylogenetic trees 
underlying two communities. A matrix of UniFrac 
distances between the individual mice was analyzed 
for principal coordinates (PCo) to partition variation 
among samples into the most important indepen- 
dent components (Figure 3a). The resulting variance 
along the principal coordinates was analyzed in a 
linear model framework with respect to the experi- 
mental setup (Figure 3b, Table 1). 
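
To make the unweighted UniFrac idea concrete, the sketch below computes it for two communities on a small rooted tree: the distance is the summed length of branches leading to members of only one community, divided by the summed length of all branches leading to members of either. This is a toy illustration rather than the Fast UniFrac implementation used for the analysis; the tree encoding (a `parent` mapping and a `length` mapping) and the function name are assumptions of the sketch.

```python
def unweighted_unifrac(parent, length, tips_a, tips_b):
    """Unweighted UniFrac distance between two sets of tree tips.

    parent: node -> parent node (the root maps to None)
    length: node -> length of the branch above that node
    """
    def branches(tips):
        seen = set()
        for tip in tips:
            node = tip
            # Walk towards the root, marking the branch above each node.
            while parent[node] is not None and node not in seen:
                seen.add(node)
                node = parent[node]
        return seen

    a, b = branches(tips_a), branches(tips_b)
    total = sum(length[n] for n in a | b)    # branches in either community
    unique = sum(length[n] for n in a ^ b)   # branches in exactly one
    return unique / total if total else 0.0
```

Identical communities give 0 and communities sharing no branch give 1; because only presence or absence is used, read abundance does not influence the value.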

Figure 2 Abundance of major bacterial phyla by anatomic site. Average abundance of sequencing reads from major bacterial phyla in +/+
(B4galnt2+/+) and -/- (B4galnt2-/-) mice along the intestine, in percent of total bacterial sequences obtained. cec.cont, cecum content;
col.cont, colon content.

Figure 3 Principal coordinate analysis (PCoA) of Unifrac distances across anatomic sites and genotypes. (a) PCoA of unweighted
Unifrac distances for the whole dataset (all anatomic sites sampled). Colors by section; filled symbols = +/+; open symbols = -/-;
symbol shape by cage of origin. (b) Box plot of PCo2 from (a); bold horizontal lines represent the median, the box edges mark the upper and
lower quartiles, whiskers extend towards the maximum and minimum; color code as in (a).

Our model identifies B4galnt2 genotype as a
significant determinant of PCo2, and hence bacterial
community composition. Both a linear model
including section and genotype as additive effects 
(model 5, Table 1) and the application of mixed- 
effects models to account for maternal (cage of 
origin) effects (models 6 and 7) assign genotype as a 
significant explanatory effect of community compo- 
sition. The effect is visualized in Figure 3b: 
The value along PCo2 assigned to B4galnt2+/+ mice
is higher for all portions of the GI tract except 
the colon, indicating a systematic effect of B4galnt2 
expression on bacterial communities along the 
GI tract. 

This analysis also reveals the striking effect of 
anatomic location on bacterial composition along 
the GI tract. The individual locations of the gut 
analyzed in our study differ strongly in the compo- 
sition of their associated bacteria, accounting for 
88% and 73%, respectively, of the variance observed 
along the first two PCos. 



Differences in the bacterial community of B4galnt2+/+
and B4galnt2-/- mice by location
Because distinct locations along the GI tract differ
largely in their bacterial composition, genotype-dependent
differences between the bacterial communities
of B4galnt2+/+ and B4galnt2-/- mice are
likely to be site-specific. Thus, to analyze the
potential influence of B4galnt2 on bacterial communities
independent of variation between sections,
we applied UniFrac to each anatomic location
separately. We find the largest genotype effect on
the bacterial communities associated with the ileum
(Figure 4, P = 7.49 × 10⁻⁵, R² = 0.81, P = 0.001 after
Bonferroni correction for testing all sections and the
first two PCos, analysis of variance).

Figure 4 Unweighted Unifrac PCoA of the ileal mucosal
bacterial community. Solid symbols = B4galnt2+/+, empty
symbols = B4galnt2-/-. Symbol shape by cage of origin.
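
The corrected P-value reported for the ileum can be reproduced from the uncorrected one. The sketch below assumes that "all sections and the first two PCos" means 14 tests (seven sampled sites times two principal coordinates); that count is an inference from the study design and is not stated explicitly.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)

n_tests = 7 * 2                  # assumed: seven sampled sites x two PCos
adjusted = bonferroni(7.49e-5, n_tests)
print(round(adjusted, 3))        # prints: 0.001
```

Under this assumption the adjusted value agrees with the reported P = 0.001.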



Individual bacterial species associated with
B4galnt2+/+ and B4galnt2-/- genotypes
To shed light on the identity of bacterial species that
make up some of the differences between the
intestinal bacterial communities associated with
B4galnt2+/+ and B4galnt2-/- mice, we applied a
common ecological measure of species habitat 
association (Dufrene and Legendre, 1997) to OTU 
clusters (97% sequence identity). OTUs identified 
as indicators of B4galnt2 expression 'habitat' (that is, 
those consistently present or more abundant in 
one genotype compared with the other) were 
classified using the Ribosomal Database Project tool 
(Wang et al., 2007). Numerous interesting candidate 
taxa belonging to the three major phyla Bacteroi- 
detes, Firmicutes and Proteobacteria were identified 
by this method (Table 2). Members of the Firmicutes 
appear to be the most widely influenced. A total of 
11 indicator OTUs belonging to the orders Clostridiales
or Lactobacillales were identified from this
group, with at least one member differentiating
B4galnt2 expression habitats in all seven sampled
locations of the GI tract. The Bacteroidetes contained
eight OTUs distributed across the jejunum,
cecum and the luminal contents of the cecum and 
colon, five of which belonged to the genus Barne- 
siella. Four OTUs belonging to the Proteobacteria 
were identified in the duodenum, colon and luminal 
contents of the colon, three of which belonged to the 
genus Helicobacter. 



Intestinal expression of B4galnt2 does not require
the presence of bacteria 

Expression of intestinal glycosyltransferases, such 
as the Fut2 glycosyltransferase, has been shown to 
be influenced by the microbiota present (Bry et al.,
1996; Meng et al., 2007). This regulatory mechanism
likely has an important role in the host's ability to 
alter the mucosal surface in response to the environ- 
ment. To determine whether B4galnt2 expression 
similarly requires the presence of intestinal bacteria,
DBA lectin staining was performed on the small
bowels of mice housed under conventional and 
germ-free conditions. Intestinal DBA lectin staining 
was present and appeared similar under both 
conventional and germ-free conditions (data not 
shown), indicating that B4galnt2 expression occurs 
in the absence of intestinal bacteria. 



Discussion 

At birth the intestinal tract is sterile, but is rapidly 
colonized by a diverse spectrum of bacteria, in 
addition to archaea and eukaryotes. These organ- 
isms are provided a nutrient-rich and largely stable 
environment by the host. In turn, the host relies on 
the microbiota for a variety of metabolic processes 
and their presence is required for normal host 
intestinal development, mucosal integrity and main- 
tenance of immunologic balance (Backhed et al., 
2005; O'Hara and Shanahan, 2006; Artis, 2008; 
Fraser et al., 2009). Furthermore, these complex 
communities protect the host from pathogenic 
organisms in several ways: by occupying microbial 
niches resulting in displacement, by production of 
antimicrobial factors and by competing for nutrients 
and receptors (O'Hara and Shanahan, 2006). By 
systematically profiling multiple locations through- 
out the murine GI tract, we have characterized 
distinct microbial communities at discrete anatomic 
sites throughout the intestine and found striking 
differences between sites and between the mucosa 
and adjacent luminal contents. Although this is 
seemingly in contrast to a previous report that found 
no difference between the populations of the 
mucosa and lumen in a humanized mouse model 
(Turnbaugh et al., 2009), this apparent discrepancy 
may be attributable to the current study being based
on sampling of unperturbed native bacterial
communities.

Table 2 Candidate bacterial species as determined by OTU genotype correlation

Section      Genotype  OTU  r      P      Seq  Taxonomy
Duodenum     +/+       154  0.76   0.02   6    Campylobacterales (100); Helicobacteraceae (100); Helicobacter (100)
             +/+       134  0.712  0.03   14   Lactobacillales (100); Streptococcaceae (100); Lactococcus (100)
             -/-       37   0.844  0.003  15   Lactobacillales (100); Streptococcaceae (100); Streptococcus (80)
             -/-       63   0.392  0.037  76   Pseudomonadales (100); Moraxellaceae (100); Acinetobacter (100)
Jejunum      +/+       135  0.579  0.039  171  Clostridiales (100); Clostridiaceae (100)
             -/-       48   0.496  0.048  22   Bacteroidales (100); Porphyromonadaceae (100); Barnesiella (100)
Ileum        -/-       62   0.707  0.015  8    Clostridiales (100); Clostridiaceae (100)
Cecum        +/+       595  0.768  0.018  7    Clostridiales (100); Lachnospiraceae (100); Moryella (86)
             -/-       37   0.622  0.029  59   Clostridiales (100); Lachnospiraceae (100); Moryella (100)
             -/-       103  0.61   0.024  41   Bacteroidales (100); Porphyromonadaceae (100)
             -/-       110  0.481  0.048  70   Bacteroidales (100); Porphyromonadaceae (100); Barnesiella (100)
Colon        +/+       123  0.704  0.046  9    Lactobacillales (100)
             +/+       30   0.597  0.044  15   Campylobacterales (100); Helicobacteraceae (100); Helicobacter (100)
Cecum cont.  +/+       331  0.763  0.012  44   Clostridiales (100)
             +/+       499  0.742  0.017  9    Clostridiales (100); Lachnospiraceae (100)
             -/-       192  0.671  0.048  8    Clostridiales (100); Lachnospiraceae (100); Syntrophococcus (88)
             -/-       105  0.632  0.047  13   Bacteroidales (100); Porphyromonadaceae (100); Barnesiella (100)
Colon cont.  +/+       23   0.632  0.027  17   Bacteroidales (100); Porphyromonadaceae (100); Barnesiella (100)
             -/-       192  0.667  0.046  12   Bacteroidales (100); Porphyromonadaceae (100); Barnesiella (100)

Abbreviation: OTU, unique identifier for the respective OTU (artificial species name); sequence number calculations are based on taxonomy;
class, family, genus are listed according to RDP taxonomy; bootstrap values are shown in brackets.

We find that variation in the expression of a single 
glycosyltransferase, B4galnt2, is associated with 
significant shifts in the composition of the intestinal 
microbiota. These differences are consistent with 
host carbohydrate-specific selection on colonizing 
microbial populations, as B4galnt2-/- animals and
their control B4galnt2+/+ littermates shared the
same microbial exposures over their lifetime. The fact
that the ileum displayed the clearest separation in
overall composition as measured by the unweighted
UniFrac metric also suggests B4galnt2-dependent
immune activity, as strong antibacterial activity is
present in this tissue (Petnicki-Ocwieja et al., 2009).
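
For readers unfamiliar with the metric, the unweighted UniFrac distance referred to above can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: the toy tree, the branch representation and the function name `unweighted_unifrac` are assumptions made for the example. The distance is the fraction of total branch length that leads to OTUs found in only one of the two communities.

```python
def unweighted_unifrac(branches, otus_a, otus_b):
    """Unweighted UniFrac distance between two communities.

    branches: list of (branch_length, set_of_descendant_OTUs) tuples
              describing a phylogenetic tree (hypothetical representation).
    otus_a, otus_b: sets of OTUs observed (presence only) in each community.
    """
    unique_length = 0.0    # branch length leading to only one community
    observed_length = 0.0  # branch length leading to either community
    for length, leaves in branches:
        in_a = bool(leaves & otus_a)
        in_b = bool(leaves & otus_b)
        if in_a or in_b:
            observed_length += length
            if in_a != in_b:
                unique_length += length
    if observed_length == 0:
        return 0.0
    return unique_length / observed_length


# Toy tree ((A:1,B:1):1,(C:1,D:1):1); OTU names are placeholders.
BRANCHES = [
    (1.0, {"A"}), (1.0, {"B"}), (1.0, {"A", "B"}),
    (1.0, {"C"}), (1.0, {"D"}), (1.0, {"C", "D"}),
]

if __name__ == "__main__":
    print(round(unweighted_unifrac(BRANCHES, {"A", "B", "C"}, {"C", "D"}), 3))  # 0.667
```

Because only presence and absence enter the calculation, the metric is sensitive to low-abundance lineages that are gained or lost, which is why it can separate communities even when dominant taxa are shared.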

In addition to overall changes in composition, we
identify a number of specific bacterial lineages
influenced by host B4galnt2 expression. Although
it could be argued that many bacterial orders found
in the bowel have at least one member that has been
shown to be important for gut health, nearly all the
candidate lineages we identified as being influenced
by B4galnt2 expression have been previously
identified as significant to intestinal communities. For
example, we discovered differences in several OTUs
belonging to the order Clostridiales in the jejunum,
ileum, cecum and cecum content, similar to two
recent reports that found distinct Clostridiales OTUs
in inflammatory bowel disease (IBD) samples
compared with the controls (Willing et al., 2010; Frank
et al., 2011). Furthermore, we observe differences in
the family Lachnospiraceae (order: Clostridiales),
which was identified to be decreased in patients
with Crohn's disease localized to the ileum
compared with controls (Willing et al., 2010) and also
reported to be disparate between healthy and
diseased mice in an IL10 knockout model of IBD
(Ye et al., 2008). Interestingly, Lachnospiraceae are
major determinants of the recently reported
'enterotypes' described in a large study of human fecal
metagenomes (Arumugam et al., 2011). Ye et al.
(2008) also identified differences in their murine
IBD model in the genus Barnesiella, which we too
found to be influenced by B4galnt2 genotype. Others
have associated Barnesiella with CD8+ T-cell
function in mice (Presley et al., 2010), suggesting
more than a bystander role for this genus in
intestinal inflammation.

Intriguingly, we found that the indicator OTUs for
B4galnt2+/+ and B4galnt2-/- mice at the same
location of the gut often were members of the same
bacterial family or genus (Figure 5). Streptococcus
and Lactococcus were indicative of B4galnt2+/+ and
B4galnt2-/- mucosal communities in the duodenum,
as were two different Moryella OTUs in the cecum.
Likewise, closely related indicator OTUs
distinguished by B4galnt2 genotype were found in the
luminal content of cecum and colon
(Lachnospiraceae and Barnesiella, respectively). Thus, similar
species appear to substitute for each other depending
on B4galnt2 genotype, indicating that many closely
related species have the potential to occupy distinct
B4galnt2 glycan-defined niches in the mucosa.
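
To make the notion of an indicator OTU concrete, the sketch below scores one OTU for one genotype group as the product of specificity and fidelity. It assumes the commonly used Dufrêne-Legendre formulation of the indicator value; this section does not state which implementation was used, and the function name and the abundances are hypothetical.

```python
from statistics import mean


def indicator_value(abundances, groups, target):
    """Indicator value of one OTU for the group `target` (range 0 to 1).

    abundances: relative abundance of the OTU in each sample.
    groups:     group label of each sample (e.g. "wt" or "ko").
    Assumes the Dufrene-Legendre definition: specificity * fidelity.
    """
    by_group = {}
    for abundance, group in zip(abundances, groups):
        by_group.setdefault(group, []).append(abundance)

    group_means = {g: mean(values) for g, values in by_group.items()}
    total = sum(group_means.values())
    if total == 0:
        return 0.0  # OTU absent everywhere

    # Specificity: share of the OTU's mean abundance found in the target group.
    specificity = group_means[target] / total
    # Fidelity: fraction of target-group samples in which the OTU is present.
    in_target = by_group[target]
    fidelity = sum(1 for a in in_target if a > 0) / len(in_target)
    return specificity * fidelity


# Hypothetical relative abundances for three wt and three ko animals.
ABUNDANCES = [0.010, 0.012, 0.008, 0.0, 0.0, 0.002]
GROUPS = ["wt", "wt", "wt", "ko", "ko", "ko"]

if __name__ == "__main__":
    print(indicator_value(ABUNDANCES, GROUPS, "wt"))
    print(indicator_value(ABUNDANCES, GROUPS, "ko"))
```

In practice the significance of such a value is usually assessed by permuting the group labels, which is omitted here for brevity.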

Unexpectedly, among the bacteria identified were
several OTUs belonging to the genus Helicobacter.
Helicobacter spp. are known to naturally infect and
cause disease in laboratory mice, most commonly
H. hepaticus, H. bilis and H. typhlonius (Feng et al.,
2005). Although a difference in Helicobacter
abundance would be expected to result in an altered risk
for enteritis or other hepatobiliary inflammatory
processes, in a protected laboratory environment
animals with Helicobacter may exhibit a more subtle
or subclinical phenotype. In humans, although
H. hepaticus and H. bilis are both associated with
hepatoenteric disease (Goldman and Mitchell,
2010), the most prevalent and best-studied
pathogenic Helicobacter is H. pylori.

Figure 5 Individual bacterial species associated with B4galnt2 genotype. OTUs of closely related species appear to substitute by
B4galnt2 genotype (B4galnt2+/+ = wt, B4galnt2-/- = ko). (a) Duodenal mucosa: OTU 134.1M (Lactococcus sp.), OTU 37.1M
(Streptococcus sp.), (b) cecal mucosa: OTU 595.5M (Moryella sp.), OTU 37.5M (Moryella sp.), (c) colon contents: OTU 23.6S (Barnesiella
sp.), OTU 192.6S (Barnesiella sp.), (d) cecal contents: OTU 499.5S (Lachnospiraceae), OTU 192.5S (Lachnospiraceae).

H. pylori predominantly inhabits the gastric mucosa 
and is associated with a spectrum of human 
intestinal diseases ranging from gastritis to malig- 
nancy (Timothy and Martin, 2009). Most H. pylori is 
located within the overlying gastric mucus layer and 
does not interact with the underlying epithelium. 
However, under some conditions H. pylori can 
adhere to the gastric mucosa, triggering virulence 
factors in the bacterium and an inflammatory 
response from the host (Timothy and Martin, 
2009). Thus, the mere presence of Helicobacter 
may result in a bystander effect by which the 
nearby-resident microbiota may be influenced not 
only by niche competition with Helicobacter but 
also by more general environmental changes due to 
mucosal inflammation and upregulation of host 
defenses. 

Adhesion to host glycans presented on the gastric
mucosa, notably carbohydrate antigens of the ABH
and Lewis blood group systems, is a critical step in
the pathogenesis of H. pylori (Kobayashi et al., 2009).
Furthermore, H. pylori is known to be capable of
specifically binding additional carbohydrate moieties,
including terminal β-1,3-GalNAc residues
(Miller-Podraza et al., 2005), while other murine
enterohepatic Helicobacter spp. isolates also demonstrate
evidence of carbohydrate-specific adhesion (Hynes
et al., 2003). In the duodenal and colonic mucosa, two
indicator Helicobacter spp. were detected in the
B4galnt2+/+ animals, yet these Helicobacter were
largely undetectable in B4galnt2-/- individuals
(Figure 6). Thus, we postulate a novel direct
interaction between Helicobacter and host mucosal
B4galnt2-derived β-1,4-GalNAc residues to be a likely
mechanism responsible for the significant increase in
abundance of these Helicobacter species observed in
the B4galnt2+/+ mice.

The conservation of intestinal B4galnt2 expression
across species suggests an important functional
role for B4galnt2 in the GI tract. This hypothesis is
supported by our finding that the loss of B4galnt2
expression significantly impacts the composition of
the resident microbiota. We propose that the
complex signatures of natural selection observed in
house mice (Johnsen et al., 2009) are at least in part
due to the variation of B4galnt2-derived GalNAc residues on
intestinal mucosal surfaces (the loss of glycans
during evolution is discussed by Bishop and
Gagneux (2007)). Taken together, these data support
a scenario in which the loss of intestinal B4galnt2
expression offers a significant fitness advantage in
the face of pathogens reliant on the presence of the
otherwise ubiquitous intestinal mucosal
B4galnt2-derived GalNAc. Host B4galnt2 glycans may
serve as a carbon source for both symbiotic and
disease-causing organisms and/or change specific
binding targets for GI pathogens. For example, even
in this study intended to characterize the symbiotic
microbiota, we detected a difference in the
pathogenic genus Helicobacter, which is known to adhere
to other host blood group carbohydrate structures
during pathogenesis.

Figure 6 Abundance of Helicobacter spp. by B4galnt2 genotype.
Relative abundance of OTUs of two Helicobacter spp. in the
duodenum and colon segregated by B4galnt2 genotype
(individuals are the same as in Figure 5, B4galnt2+/+ = wt and
B4galnt2-/- = ko). *ko6 duodenal mucosal sample was not
available for analysis (see Results).

In summary, we have found that B4galnt2 expression
influences the intestinal microbiota. Helicobacter
and other enteric pathogens known to interact with
GalNAc residues, including protozoans (Entamoeba
histolytica (Frederick and Petri, 2005)), viruses
(Norovirus (Shirato et al., 2008)) and a spectrum of
bacteria (Krivan et al., 1988; Karlsson, 1995), are
excellent candidates to underlie the striking
signatures of selection at the B4galnt2 locus in wild-mouse
populations.